Abstract: In this paper we present a method for automatically
planning optimal paths for a group of robots that satisfy
a common high level mission specification. Each robot's motion 
in the environment is modeled as a weighted transition system. 
The mission is given as a Linear Temporal Logic formula. 
In addition, an optimizing proposition must repeatedly be 
satisfied. The goal is to minimize the maximum time between 
satisfying instances of the optimizing proposition. Our method 
is guaranteed to compute an optimal set of robot paths. We 
utilize a timed automaton representation in order to capture 
the relative position of the robots in the environment. We 
then obtain a bisimulation of this timed automaton as a finite 
transition system that captures the joint behavior of the robots 
and apply our earlier algorithm for the single robot case 
to optimize the group motion. We present a simulation of a 
persistent monitoring task in a road network environment. 

I. Introduction 

Recently there has been an increased interest in using 
temporal logics to specify mission plans for robots [1], [2], 
[3], [4], [5]. Temporal logics are appealing because they 
provide a formal high level language in which to describe a 
complex mission. In addition, tools from model checking [6], 
[7] can be used to generate a robot path satisfying the 
specification, if such a path exists. However, frequently there 
are multiple robot paths that satisfy a given specification. 
In this case, one would like to choose the optimal path 
according to a cost function. The current tools from model 
checking do not provide a method for doing this. In our 
previous work [8] we considered Linear Temporal Logic 
(LTL) specifications, and a particular form of cost function, 
and provided a method for computing optimal robot paths 
for one robot. In this paper we extend this result to multiple 
robots. 

For simplicity of presentation, we assume that each robot 
moves among the vertices of an environment modeled as 
a graph. However, by using feedback controllers for facet 
reachability and invariance in polytopes [9], [10], the method
developed in this paper can be easily applied for motion 
planning and control of robots with "realistic" continuous dy- 
namics (e.g., unicycle) traversing an environment partitioned 
using popular partitioning schemes such as triangulations and 
rectangular partitions. 

The main difficulty in moving from a single robot to multiple
robots is in synchronizing the motion of the robots, or in
allowing the robots to move asynchronously. In [11], the authors
propose a method for decentralized motion of multiple
robots subject to LTL specifications. In their approach, the 
robots take transitions (i.e., travel along edges in the graph) 
synchronously. Once every robot has completed a transition, 
the robots can synchronously make the next transition. While 
such an approach is effective for satisfying the LTL formula, 
it does not lend itself to optimizing the robot motion, 
since time is spent "waiting" for other robots. In [12], the 
authors take a different approach, representing the motion of 
the robots in the environment as a timed automaton. Each 
robot then has a continuous clock variable that describes 
its progress along a transition (i.e., a robot's position along 
an edge between two vertices). The authors then look at 
satisfying specifications given in computation tree logic
(CTL). In this paper, we utilize a similar timed-automaton
representation. However, we consider LTL specifications, 
for which the control synthesis problem is fundamentally 
different. In addition, rather than just satisfying the formulas, 
we optimize the motion of the robots. 

In terms of optimizing paths, the most closely related 
work has been on the vehicle routing problem (VRP) [13]. 
Recent results [14], [15] present extensions of vehicle routing 
problems to more general classes of temporal constraints. 
In [15], the authors consider vehicle routing with metric 
temporal logic specifications. The goal is to minimize a cost 
function of the vehicle paths (such as total distance traveled). 
The authors present a method for computing an optimal set 
of paths by converting the problem to a mixed integer linear 
program (MILP). While the approach is computationally 
intensive, it has been used to solve problems of real-world 
significance. However, their method does not apply to the 
persistent monitoring and data gathering applications that are 
of interest in this paper. In particular, it does not allow for 
specifications of the form "always eventually," which appear 
when specifying that a robot should repeatedly perform a 
task. In this paper we take an entirely different approach 
to optimizing robot motion, resulting in an optimization 
problem on a graph, rather than a MILP. 

The contribution of this paper is to present a method for 
generating optimal paths for a group of robots satisfying 
general LTL formulas. We focus on minimizing a cost 
function that captures the maximum time between satisfying 
instances of an optimizing proposition. The cost is motivated 
by problems in persistent monitoring and in pickup and deliv- 
ery problems. Our solution relies on describing the motion of 
the group of robots in the environment as a timed automaton. 
This description allows us to represent the relative position 
between robots. Such information is necessary for optimizing 
the robot motion. We provide a bisimulation [16] of the
infinite-dimensional timed automaton to a finite-dimensional
transition system. From this point we are able to apply our 
previous results [8] to compute an optimal run. This run 
then maps to a path for each robot. We provide simulation 
results for robots in a road network environment. The main 
hurdle in our approach is the computational complexity. We 
discuss ways in which this can be reduced, and show that 
fairly complex problems can be solved under this framework. 
The organization of the paper is as follows. In Section II
we give some preliminaries. In Section III we formally
state the motion planning problem for a team of robots,
and in Section IV we present our solution. In Section V
we present an experimental case study for a team of robots
performing persistent data gathering missions in a road
network environment. Finally, in Section VI we discuss
some promising future directions.

II. Preliminaries 

A. Transition Systems and LTL 

Definition II.1 (Transition Systems). A (weighted) transition
system (TS) is a tuple $T := (Q_T, q_T^0, \to_T, \Pi, \mathcal{L}_T, w_T)$,
where

(i) $Q_T$ is a finite set of states,
(ii) $q_T^0 \in Q_T$ is the initial state,
(iii) $\to_T \subseteq Q_T \times Q_T$ is the transition relation,
(iv) $\Pi$ is a finite set of atomic propositions (observations),
(v) $\mathcal{L}_T : Q_T \to 2^{\Pi}$ is a map giving the set of all atomic
propositions satisfied in a state, and
(vi) $w_T : \to_T \to \mathbb{R}_{>0}$ is a map that assigns a positive weight
to each transition.

We define a run of $T$ as an infinite sequence of states $r_T =
q^0 q^1 \ldots$ such that $q^0 = q_T^0$, $q^k \in Q_T$ and $(q^k, q^{k+1}) \in \to_T$
for all $k \geq 1$. A run generates an infinite word $\omega_T =
\mathcal{L}(q^0)\mathcal{L}(q^1) \ldots$ where $\mathcal{L}(q^k)$ is the set of atomic propositions
satisfied at state $q^k$.
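
As an illustration of Def. II.1, the following is a minimal Python sketch of a weighted transition system, together with the word and the arrival times generated by a finite prefix of a run. The class name, the field names, and the three-vertex example are assumptions made for illustration and are not taken from the paper's implementation.

```python
from dataclasses import dataclass


@dataclass
class TransitionSystem:
    """Weighted transition system; all names here are illustrative."""
    states: set
    init: str
    weights: dict  # (q, q') -> positive weight; the keys form the transition relation
    labels: dict   # q -> set of atomic propositions satisfied at q

    def is_run_prefix(self, run):
        """Check that a finite state sequence is a valid prefix of a run."""
        if not run or run[0] != self.init:
            return False
        return all((q, q2) in self.weights for q, q2 in zip(run, run[1:]))

    def word(self, run):
        """Sequence of proposition sets generated along the run prefix."""
        return [self.labels.get(q, set()) for q in run]

    def arrival_times(self, run):
        """Time at which each state of the run prefix is reached."""
        t, times = 0, [0]
        for q, q2 in zip(run, run[1:]):
            t += self.weights[(q, q2)]
            times.append(t)
        return times


# Hypothetical three-vertex environment with the proposition "pi" at vertex a.
ts = TransitionSystem(
    states={"a", "b", "c"},
    init="a",
    weights={("a", "b"): 1, ("b", "a"): 2, ("b", "c"): 1, ("c", "b"): 1},
    labels={"a": {"pi"}},
)
run = ["a", "b", "c", "b", "a"]
```

Storing the weights in a dictionary keyed by state pairs lets the same structure serve as both the transition relation and the weight map.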

Definition II.2 (Formula of LTL). An LTL formula $\phi$ over
the atomic propositions $\Pi$ is defined inductively as follows:

$$\phi ::= \top \mid \alpha \mid \phi \vee \phi \mid \phi \wedge \phi \mid \neg\phi \mid \mathbf{X}\,\phi \mid \phi\,\mathcal{U}\,\phi$$

where $\top$ is a predicate true in each state of a system, $\alpha \in \Pi$
is an atomic proposition, $\neg$ (negation), $\vee$ (disjunction) and
$\wedge$ (conjunction) are standard Boolean connectives, and $\mathbf{X}$
and $\mathcal{U}$ are temporal operators.

LTL formulas are interpreted over infinite words (generated
by the transition system $T$ from Def. II.1). Informally,
$\mathbf{X}\,\alpha$ states that at the next state of a word, proposition $\alpha$ is
true; and $\alpha_1\,\mathcal{U}\,\alpha_2$ states that there is a future moment when
proposition $\alpha_2$ is true, and proposition $\alpha_1$ is true at least until
$\alpha_2$ is true. From these temporal operators we can construct
two other temporal operators: Eventually (i.e., future), $\mathbf{F}$,
defined as $\mathbf{F}\,\phi := \top\,\mathcal{U}\,\phi$, and Always (i.e., globally), $\mathbf{G}$,
defined as $\mathbf{G}\,\phi := \neg\mathbf{F}\,\neg\phi$. The formula $\mathbf{G}\,\phi$ states that $\phi$ is
true at all positions of the word; the formula $\mathbf{F}\,\phi$ states that
$\phi$ eventually becomes true in the word. More expressivity
can be achieved by combining the temporal and Boolean
operators. We say a run $r_T$ satisfies $\phi$ if and only if the
word generated by $r_T$ satisfies $\phi$.

Definition II.3 (Büchi Automaton). A Büchi automaton is a
tuple $B := (S, S_0, \Sigma, \delta, \mathcal{F})$, consisting of (i) a finite set of
states $S$; (ii) a set of initial states $S_0 \subseteq S$; (iii) an input
alphabet $\Sigma$; (iv) a non-deterministic transition relation $\delta \subseteq
S \times \Sigma \times S$; (v) a set of accepting (final) states $\mathcal{F} \subseteq S$.

A run of the Büchi automaton over an input word $\omega =
\omega^0 \omega^1 \ldots$ is a sequence $r_B = s^0 s^1 \ldots$, such that $s^0 \in S_0$,
and $(s^k, \omega^k, s^{k+1}) \in \delta$, for all $k \geq 1$. A Büchi automaton
accepts a word over $\Sigma$ if at least one of the corresponding
runs intersects with $\mathcal{F}$ infinitely many times. For any LTL
formula $\phi$ over $\Pi$, one can construct a Büchi automaton with
input alphabet $\Sigma \subseteq 2^{\Pi}$ accepting all and only words over
$\Pi$ that satisfy $\phi$. We refer readers to [17] and references
therein for efficient algorithms and freely downloadable
implementations to translate an LTL formula over $\Pi$ to a
corresponding Büchi automaton.

B. Timed Automata 

A clock is a real-valued variable that increases at a rate of
one as time progresses. Clocks may be valuated, or reset to
zero. Let $C$ denote a set of clocks. A clock valuation of some
clock $x \in C$, denoted as $v(x)$, is a mapping from $C$ to $\mathbb{R}_{\geq 0}$
that assigns a real value to each clock. A clock constraint $g$
over a set of clocks $C$ is formed according to the grammar

$$g ::= x < c \mid x \leq c \mid x > c \mid x \geq c \mid g \wedge g,$$

where $c \in \mathbb{N}$ is a constant and $x \in C$ is a clock. We let
$G$ denote the set of all clock constraints over $C$. A clock
valuation $v(x)$ of some clock $x$ satisfies a clock constraint
$g$ at some time iff $g$ evaluates to true for $v(x)$.
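
The clock-constraint grammar above can be sketched in a few lines of Python. The representation chosen here (a constraint as a list of atoms read as a conjunction, a valuation as a dictionary) and the function names are assumptions for illustration only.

```python
import operator

# Comparison operators allowed by the clock-constraint grammar.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def satisfies(valuation, constraint):
    """constraint: list of (clock, op, c) atoms, read as a conjunction."""
    return all(OPS[op](valuation[clock], c) for clock, op, c in constraint)


def advance(valuation, dt):
    """Let time pass: every clock increases at rate one."""
    return {x: v + dt for x, v in valuation.items()}


def reset(valuation, clocks):
    """Reset the given clocks to zero and leave the others unchanged."""
    return {x: (0.0 if x in clocks else v) for x, v in valuation.items()}


v0 = {"x1": 0.0, "x2": 0.0}
g = [("x1", ">=", 2), ("x2", "<", 3)]  # x1 >= 2 and x2 < 3
v1 = advance(v0, 2.5)
v2 = reset(v1, {"x1"})
```

Here `satisfies(v0, g)` is false because `x1` has not yet reached 2, while `satisfies(v1, g)` is true after 2.5 time units have elapsed.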

Definition II.4 (Timed Automata). A timed automaton is a
tuple $A := (Q_A, q_A^0, C_A, G_A, \to_A, \Pi, \mathcal{L}_A)$ where
(i) $Q_A$ is a finite set of states,
(ii) $q_A^0 \in Q_A$ is an initial state,
(iii) $C_A$ is a finite set of clocks,
(iv) $G_A$ is a finite set of clock constraints over $C_A$,
(v) $\to_A \subseteq Q_A \times G_A \times 2^{C_A} \times Q_A$ is the transition relation.
A transition is a tuple $(q, g, c, q')$ where $q$ is the
source state, $q'$ is the destination state, $g$ is the clock
constraint that enables the transition, and $c \subseteq C_A$ is
the clock-resets, which is the set of clocks to be reset
right after the transition,
(vi) $\Pi$ is a finite set of atomic propositions, and
(vii) $\mathcal{L}_A$ is a map assigning a subset of $\Pi$ to each transition
of $\to_A$.

The semantics of the timed automaton can be understood
as follows: starting from the initial state $q_A^0$, the values of all
clocks increase at rate one, and the system remains at this
state until a clock constraint corresponding to an outgoing
transition is satisfied. When this happens, the transition is
immediately taken and the clocks in the clock-resets are
reset. The timed automaton from Def. II.4 can be seen as
a particular case of the timed automaton defined in [18],
which also allows for clock invariants associated with states.



A timed automaton, as defined in Def. II.4, has a finite
set of clock regions $\mathcal{R}_A$, which is the set of equivalence
classes of clock valuations induced by its clock constraints
$G_A$. Intuitively, a clock region $r \in \mathcal{R}_A$ is a subset of the
infinite set of all clock valuations of $C_A$, in which all clock
valuations are equivalent in the sense that the future behavior
of the system is the same. In [18], it has been shown that
a clock region can be either a corner point (e.g., $(0, 1)$), an
open line-segment (e.g., $0 < x_1 = x_2 < 1$), or an open
region (e.g., $0 < x_1 < x_2 < 1$). The clock regions $\mathcal{R}_A$ of a
timed automaton $A$ induce an equivalence relation $\sim_A$ over
its state space, and a simulation quotient, which we refer to
as the region automaton $R = A / \sim_A$. The region automaton
$R$ induced by this equivalence relation is a bisimulation
quotient. To define $R$, we define a clock region $r''$ to be
the time-successor of a clock region $r$ if and only if there
is a $t > 0$ such that all possible clock valuations in $r$ are in
clock region $r''$ after time $t$.

Definition II.5 (Region Automata). The region automaton
$R$ of a timed automaton $A$ (Def. II.4) is a tuple $R :=
(Q_R, q_R^0, \to_R)$, where
(i) $Q_R$ is the set of states of the form $\{q, r\}$ such that
$q \in Q_A$ and $r \in \mathcal{R}_A$,
(ii) $q_R^0$ is the initial state of the form $\{q_A^0, r^0\}$ such that
$q_A^0$ is the initial state of $A$ and all clock valuations of
$r^0$ are zero, i.e., $x = 0$ for all $x \in r^0$,
(iii) $\to_R$ is the transition relation such that there is a
transition from $\{q, r\}$ to $\{q', r'\}$ if and only if there is
a transition from $q$ to $q'$ in $A$ and a clock constraint
$g$ in $G_A$ and a clock region $r''$ such that:
(a) $r''$ is a time-successor of $r$,
(b) $r''$ satisfies the clock constraint $g$, and
(c) $r''$ goes to $r'$ when corresponding clocks are reset
once $g$ is satisfied and the transition is made.



III. Problem Formulation and Approach 



Let

$$\mathcal{E} = (V, \to_{\mathcal{E}}) \qquad (1)$$

be a graph of the environment, where $V$ is the set of vertices
and $\to_{\mathcal{E}} \subseteq V \times V$ is a relation modeling the set of edges.
In practice, $\mathcal{E}$ can be the quotient graph of a partitioned
environment, where $V$ is a set of labels for the regions
in the partition, and $\to_{\mathcal{E}}$ is the corresponding adjacency
relation. For example, $V$ can be a set of labels for the roads,
intersections, and buildings in an urban-like environment and
$\to_{\mathcal{E}}$ can show how these are connected (see Fig. 5).

Consider a team of $m$ robots moving in an environment
modeled by $\mathcal{E}$. The motion capabilities of robot $i \in
\{1, \ldots, m\}$ can be represented by a transition system (see
Def. II.1)

$$T_i = (Q_i, q_i^0, \to_i, \Pi, \mathcal{L}_i, w_i), \qquad (2)$$

where $Q_i \subseteq V$; $q_i^0$ is the initial vertex of robot $i$; $\to_i \subseteq \to_{\mathcal{E}}$
is a relation modeling the capability of robot $i$ to move
among the vertices; $\Pi$ is a set of propositions assigned to the
environment, which are assigned by $\mathcal{L}_i$ to robot $i$; $w_i(q, q')$
captures the time for robot $i$ to go from vertex $q$ to $q'$, and
we assume that $w_i(q, q')$ is always an integer. In this robotic
model, robot $i$ travels along the edges of $T_i$, and spends
zero time on the vertices. Note that we allow the assignment
of propositions to differ for different robots to capture
the possibly different capabilities of the robots to satisfy
propositions in the environment. Also, in the definition of
transition systems, each transition is deterministic, so any
run on $T_i$ can always be followed by robot $i$.

We assume that there is an atomic proposition $\pi \in \Pi$,
called the optimizing proposition. We consider LTL formulas
of the form

$$\phi := \varphi \wedge \mathbf{G}\,\mathbf{F}\,\pi, \qquad (3)$$

where $\varphi$ can be any LTL formula over $\Pi$, and $\mathbf{G}\,\mathbf{F}\,\pi$ specifies
that proposition $\pi$ must be satisfied infinitely often. In a
persistent data gathering task, $\pi$ can be assigned to regions
where new data is gathered, while $\varphi$ could be used to specify
rules (such as traffic rules) that must be obeyed at all times
during the task.

We assume that each run $r_i = q_i^0 q_i^1 \ldots$ of a $T_i$ (robot $i$)
starts at $t = 0$ and generates a word $\omega_i = \omega_i^0 \omega_i^1 \ldots$ and an
infinite sequence of time instances $\mathbb{T}_i := t_i^0 t_i^1 \ldots$ such that
$\omega_i^k = \mathcal{L}_i(q_i^k)$ is satisfied at $t_i^k$. In order to define the behavior
of the team as a whole, we consider the sequences $\mathbb{T}_i$ as sets,
take the union $\bigcup_{i=1}^{m} \mathbb{T}_i$, and order this set in ascending
order to obtain $\mathbb{T} := t^0 t^1 \ldots$. Then, we define $\omega = \omega^0 \omega^1 \ldots$
to be the word generated by the team of robots where $\omega^k$
is the union of all propositions satisfied at $t^k$. Finally, we
define the infinite sequence $\mathbb{T}^\pi = \mathbb{T}^\pi(1), \mathbb{T}^\pi(2), \ldots$ where
$\mathbb{T}^\pi(k)$ stands for the time instance when $\pi$ is satisfied for the
$k$th time by the team. We can now formulate the problem:

Problem III.1. Given a team of robots modeled as transition
systems $T_i$ and an LTL formula $\phi$ in the form (3),
synthesize a run $r_i$ for each robot in the team such that the
word generated by the team satisfies $\phi$ and $\mathbb{T}^\pi$ minimizes

$$J(\mathbb{T}^\pi) = \limsup_{i \to +\infty} \left( \mathbb{T}^\pi(i+1) - \mathbb{T}^\pi(i) \right). \qquad (4)$$

Note that a solution to Prob. III.1 minimizes the maximum
time between satisfying instances of $\pi$. Since we consider
LTL formulas containing $\mathbf{G}\,\mathbf{F}\,\pi$, this optimization problem
is always well-posed. For the data gathering task previously
mentioned, this translates to minimizing the maximum time
in between two data gatherings.
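
To make the team word and the cost (4) concrete, the following Python sketch merges per-robot event lists into a team sequence and evaluates the cost for a sequence whose satisfying instances of the optimizing proposition eventually repeat periodically. The function names, the event representation, and the sample data are assumptions for illustration. For a periodic sequence, the limsup in (4) reduces to the largest gap within one period, including the wrap-around gap into the next repetition.

```python
def merge_team_events(robot_events):
    """robot_events: one list per robot of (time, set_of_props) pairs.
    Returns the team sequence [(t, union of props satisfied at t)] ordered by time."""
    merged = {}
    for events in robot_events:
        for t, props in events:
            merged.setdefault(t, set()).update(props)
    return sorted(merged.items())


def pi_times(team_events, pi="pi"):
    """Times at which the optimizing proposition is satisfied by the team."""
    return [t for t, props in team_events if pi in props]


def periodic_cost(suffix_pi_times, period):
    """Cost (4) when the pi-instances repeat with the given period.
    suffix_pi_times: times of pi within one period [t0, t0 + period)."""
    ts = sorted(suffix_pi_times)
    if not ts:
        return float("inf")
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    gaps.append(ts[0] + period - ts[-1])  # wrap-around gap into the next repetition
    return max(gaps)


# Hypothetical events for two robots over one period of length 6.
robot1 = [(0, {"pi"}), (3, set()), (6, {"pi"})]
robot2 = [(0, set()), (2, {"pi"}), (6, set())]
team = merge_team_events([robot1, robot2])
```

With this data the team satisfies the optimizing proposition at times 0, 2 and 6, so `periodic_cost([0, 2], 6)` evaluates the gaps 2 and 4 and returns 4.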

Our solution to Problem III.1 can be outlined as follows:
(i) For each transition system $T_i$, $i = 1, \ldots, m$, we
obtain the dual transition system $D_i$ where states and
transitions are swapped and propositions are assigned
to the transitions (Sec. IV-A);
(ii) For each dual transition system $D_i$, we obtain a
corresponding timed automaton $A_i$. Each timed automaton
has a single clock, which keeps track of the
amount of time that a robot has been traveling between
states in the original transition system $T_i$, and we
create a product timed automaton $P$ as the parallel
composition of the timed automata $A_i$, $i = 1, \ldots, m$
(Sec. IV-B);
(iii) We obtain the region automaton $R$ as the bisimulation
quotient of $P$ (Sec. IV-C);
(iv) We find the optimal run on $R$ using the OPTIMAL-RUN
algorithm we previously developed in [8]. We
project this run back to the individual $T_i$, $i = 1, \ldots, m$,
to obtain the solution to Prob. III.1 (Sec. IV-D).



IV. Problem Solution

In this section, we explain each step of the solution
to Prob. III.1 in detail. For illustration, we use a simple
example throughout this section involving two robots in an
environment consisting of three vertices. We present a
multi-robot scenario in a more realistic setting in Sec. V.

A. Dual Transition Systems 

We proceed by converting the transition system $T_i$ for
each robot to a dual transition system $D_i$. The dual $D$ of
a transition system $T$ is obtained by swapping its states
with its transitions. More precisely, given $T = (Q_T, q_T^0, \to_T,
\Pi, \mathcal{L}_T, w_T)$, we define $D = (Q_D, q_D^0, \to_D, \Pi, \mathcal{L}_D, w_D)$ as
follows: if $(a, b), (b, c) \in \to_T$, then $ab, bc \in Q_D$ and $(ab, bc) \in \to_D$.
Intuitively, this means that the robot can "go from $a$ to $c$
through $b$." As propositions are originally assigned to the
states of $T$, they are satisfied on the transitions of $D$, i.e., if
$(ab, bc) \in \to_D$, then $\mathcal{L}_D((ab, bc)) = \mathcal{L}_T(b)$. In addition,
weights assigned to transitions of $T$ are now defined on
states of $D$, i.e., $w_D(ab) = w_T(a, b)$. This means that in the
dual $D_i$ of a $T_i$, time is spent on the vertices and transitions
are instantaneous. Since the initial state $q_T^0$ of $T$ can have
multiple outgoing transitions, the initial state $q_D^0$ does not
correspond to any transition; it therefore has zero weight,
but it connects to all outgoing transitions of $q_T^0$. The duals
of two simple transition systems are shown in Fig. 1.
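
The dual construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: vertex names are single characters (so that the dual state for edge (a, b) can be written "ab"), the name "init" for the dual initial state is hypothetical, and the label placed on transitions leaving the dual initial state (the propositions of the initial vertex) is an assumption that the text does not spell out.

```python
def dual(init, weights, labels):
    """Build the dual of a weighted transition system.

    weights: {(a, b): w} for every transition (a, b) of T.
    labels:  {vertex: set of propositions}.
    Returns (dual initial state, {dual state: weight}, {dual transition: props}).
    """
    INIT = "init"  # hypothetical name for the dual initial state
    state_w = {INIT: 0}  # the dual initial state has zero weight
    state_w.update({a + b: w for (a, b), w in weights.items()})
    trans = {}
    for (a, b) in weights:
        if a == init:
            # Assumption: leaving the initial state observes the initial vertex.
            trans[(INIT, a + b)] = labels.get(a, set())
        for (b2, c) in weights:
            if b2 == b:
                # "Go from a to c through b": observe the propositions of b.
                trans[(a + b, b + c)] = labels.get(b, set())
    return INIT, state_w, trans


# Hypothetical three-vertex transition system with "pi" satisfied at vertex a.
W = {("a", "b"): 1, ("b", "a"): 2, ("b", "c"): 1, ("c", "b"): 1}
L = {"a": {"pi"}}
init_d, state_w, trans = dual("a", W, L)
```

In this example the dual has the states init, ab, ba, bc and cb, and the transition (ba, ab) carries the proposition of vertex a.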




Fig. 1: (a) and (b) show the transition systems $T_1$ and $T_2$ for two
robots in an environment with three vertices. The states of the transition
systems correspond to vertices $\{a, b, c\}$ and the edges represent the motion
capabilities of each robot. The weights of the edges represent the time
needed to traverse from one state to another; (c) and (d) are the dual transition
systems $D_1$ and $D_2$ corresponding to $T_1$ and $T_2$, respectively. A state
labelled $ab$ means that the robot is travelling from vertex $a$ to $b$.



B. Construction of the timed automata 

By constructing the duals of the original transition systems
of individual robots, we can now fully capture the evolution
of time for each robot taking transitions on $T_i$ with a timed
automaton as defined in Def. II.4. We can then generate a
product timed automaton capturing the time evolution of the
whole team.

To this end, for each robot, we define a clock $x_i$, which
records how much time has passed in each state of $D_i$. We
interpret the weights on the states of $D_i$ as clock constraints,
i.e., each state $ab$ in $D_i$ is associated with a clock constraint
$v(x_i) \geq w_{T_i}(a, b)$. We set the initial value of the clock for
each robot to 0, and we let the clock constraint for the
initial state of $D_i$ be immediately satisfied. At each state,
once the clock constraint is satisfied, it triggers an outgoing
transition and clock $x_i$ is reset to 0. As mentioned before in
Def. II.4, we enforce a transition when a clock constraint is
satisfied. We denote the timed automaton corresponding to
$D_i$ as $A_i$. The timed automata corresponding to the $D_i$'s in
Fig. 1 are illustrated in Fig. 2.

Fig. 2: Timed automata $A_1$ and $A_2$ of each robot, corresponding to $D_1$
and $D_2$ shown in Fig. 1c and Fig. 1d, respectively. The equations next to
each arrow represent the clock constraint and the clock-reset associated
with each transition of the timed automaton.



We capture the joint behavior of the robots by taking the
parallel composition of the individual timed automata $A_i$,
$i = 1, \ldots, m$, and calling it the product timed automaton
$P$. The set of states of $P$ is the Cartesian product of the
sets of states of $D_i$, $i \in \{1, \ldots, m\}$. The initial state of
$P$ is $(q_{D_1}^0, \ldots, q_{D_m}^0)$. We enable a transition from state
$(q_1, \ldots, q_m)$ to $(q_1', \ldots, q_m')$ if and only if, for all $i$, either
$(q_i, g_i, c_i, q_i') \in \to_{A_i}$, or, if $(q_i, g_i, c_i, q_i') \notin \to_{A_i}$ for some
$i$, then $q_i = q_i'$. We label this transition with the union
of propositions satisfied by the corresponding transitions in
$\to_{D_i}$, and similarly the clock constraints that enable this
transition are the union of all clock constraints $g_i$ associated
with the transitions that are taken and inverses of the clock
constraints associated with the remaining transitions that are
not taken. Moreover, the clocks are reset for all robots $i$ that
transitioned to a new state $q_i'$. We require that at least one
robot $i$ makes a transition to a new state for each transition of
$P$. Since we enforce each transition to be taken immediately
when all clock constraints are satisfied, some transitions of
$P$ may never be taken because they are always preceded by
some other transitions for all possible clock values. Such
transitions will be referred to as invalid transitions. For the
example given in Fig. 1 and Fig. 2, we show the resulting
product timed automaton $P = A_1 \times A_2$ in Fig. 3 (without
invalid transitions).




Fig. 3: The product timed automaton $P = A_1 \times A_2$ describing the motion of the two
robots. The state $(ab, ba)$ denotes that $A_1$ is in state $ab$ and $A_2$ is in state $ba$.
To avoid notation clutter, we do not show the clock-resets and invalid
transitions. For example, in the transition from state $(ba, bc)$ to $(ba, cb)$,
robot 2 completes a transition, so its clock is reset, while robot 1 does not
complete a transition, so its state stays the same and its clock is not reset. The
transition from $(ba, bc)$ to $(ab, bc)$ is invalid, because it can never happen
before the transition from $(ba, bc)$ to $(ba, cb)$.



C. Construction of the Region Automaton 

From the product timed automaton $P$, we can obtain the
region automaton $R$ as a bisimulation quotient of $P$ (see Sec.
II-B). Note that the bisimulation quotient we obtain from $P$
is a particular case of the bisimulation quotient of a general
timed automaton, where the transitions are enforced when
clock constraints are satisfied. In the process of obtaining
$R$, all invalid transitions of $P$ are automatically removed,
by the definition of region automata.

We can now assign propositions and weights to $R$, converting
it to a transition system as defined in Def. II.1. We
define a function $\mathcal{L}_R : Q_R \to 2^{\Pi}$ such that, for each transition
$(\{q, r\}, \{q', r'\})$, the set of propositions corresponding
to the transition $(q, g, c, q')$ on $P$ are assigned to the
destination state $\{q', r'\}$, i.e., observations defined on the transitions of $P$ are
carried to their destination states in $R$. In the following,
we take $m$ to be the number of clocks, or equivalently
the number of robots, in the product timed automaton $P$,
and $d_i$ to be the largest integer constant that some clock
$x_i \in C_P = \{x_1, \ldots, x_m\}$ is compared with.

Proposition IV.1. For each state $\{q, r\}$ of the region automaton
$R$, clock region $r$ is always a tuple $(v(x_1), \ldots, v(x_m))$,
where $v(x_i)$ are integers for all $i = 1, \ldots, m$.

Proof. Clock constraints are positive integers smaller than
or equal to $d_i$. Since the transitions are enforced when clock
constraints are satisfied, and the initial clock is set to 0,
every time a transition on $P$ is taken, after the clock-resets,
we have $v(x_i) \in \{0, \ldots, d_i - 1\}$, for all $i = 1, \ldots, m$.
Therefore, the set of clock regions that can be reached on
$R$ (the bisimulation quotient of $P$) are always corner points,
i.e., a tuple $(v(x_1), \ldots, v(x_m))$, where $v(x_i)$ are integers for
all $i = 1, \ldots, m$. ■



Using Prop. IV.1, we now assign a weight to each transition
of $R$. Given a transition $(\{q, r\}, \{q', r'\})$, we define
its weight to be the time $t$ it takes to reach from $r =
(v(x_1), \ldots, v(x_m))$ to $r'' = (v(x_1) + t, \ldots, v(x_m) + t)$,
where $r''$ is a time-successor of $r$. The region automaton
corresponding to the product automaton from Fig. 3 is shown
in Fig. 4.




Fig. 4: The finite state region automaton capturing the joint behavior of two 
robots in 9 states. In the circle representing a state {q,r}, the first line is 
q and the second line is r. 

The following proposition gives the bound on the size of 
the region automaton R. 

Proposition IV.2. The number of states $|Q_R|$ of $R$ is
bounded by

$$|Q_P| \left( \prod_{i=1}^{m} d_i - \prod_{i=1}^{m} (d_i - 1) \right). \qquad (5)$$

Proof. From Prop. IV.1, all clock regions of $R$ are corner
points, i.e., tuples of integers taking values within the range
$\{0, \ldots, d_i - 1\}$. Counting the number of possible reachable
clock regions, we have

$$\prod_{i=1}^{m} d_i - \prod_{i=1}^{m} (d_i - 1),$$

where $\prod_{i=1}^{m} d_i$ is the number of all possible corner points
and $\prod_{i=1}^{m} (d_i - 1)$ is the number of corner points where all
clocks are non-zero (since one clock must be zero after the
reset, these corner points cannot be reached). Given a product
timed automaton with $|Q_P|$ states, using the above
bound on the number of reachable clock regions we
can conclude that $|Q_R|$ is bounded by (5). ■
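
The counting argument in the proof of Prop. IV.2 can be checked numerically. The sketch below compares the closed-form count of corner points with at least one zero clock against a brute-force enumeration; the function names and the sample values of $d_i$ are assumptions for illustration.

```python
from itertools import product
from math import prod


def reachable_corner_bound(d):
    """Closed form from Prop. IV.2: prod(d_i) - prod(d_i - 1) regions per state."""
    return prod(d) - prod(di - 1 for di in d)


def count_by_enumeration(d):
    """Count integer tuples in {0..d_1-1} x ... x {0..d_m-1} with at least one zero."""
    return sum(1 for v in product(*(range(di) for di in d)) if 0 in v)


d = (2, 3, 4)  # hypothetical largest clock constants for three robots
closed_form = reachable_corner_bound(d)
enumerated = count_by_enumeration(d)
```

For `d = (2, 3, 4)` there are 24 corner points in total, of which 6 have all clocks non-zero, so both functions give 18.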

Remark IV.3. In [18] the authors give the upper bound on
the number of clock regions $|\mathcal{R}_P|$ of $P$ as

$$m! \cdot 2^m \cdot \prod_{i=1}^{m} (2 d_i + 2),$$

which gives the upper bound on the number of states of $R$ as
$|Q_P| \cdot m! \cdot 2^m \cdot \prod_{i=1}^{m} (2 d_i + 2)$. From Prop. IV.1, using our
particular case of timed automata, $|Q_R|$ is reduced by at least
a factor of $m! \cdot 2^m$.
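
As a worked illustration of Remark IV.3, consider hypothetical values $m = 2$ and $d_1 = d_2 = 2$ (chosen only for the arithmetic, not taken from the paper's example). The two per-state bounds on the number of clock regions compare as follows.

```latex
% Bound from Prop. IV.2 (per state of P), with m = 2 and d_1 = d_2 = 2:
\prod_{i=1}^{2} d_i - \prod_{i=1}^{2} (d_i - 1) = 2 \cdot 2 - 1 \cdot 1 = 3

% General bound from [18] (per state of P), same values:
m! \cdot 2^m \cdot \prod_{i=1}^{2} (2 d_i + 2) = 2 \cdot 4 \cdot (6 \cdot 6) = 288

% Ratio of the two bounds, compared with the factor m! * 2^m = 8:
\frac{288}{3} = 96 \geq 8
```

For these values the reduction is larger than the guaranteed factor $m! \cdot 2^m$, which is consistent with the remark stating a lower bound on the reduction.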



We use Alg. 1 to obtain the region automaton $R$, by
applying a (recursive) depth-first search (DFS) on $P$. We
note that line 8 in Alg. 1 removes all invalid transitions in
$P$. Moreover, Alg. 1 generates $R$ by finding all reachable
clock regions of $P$.

Algorithm 1: Obtain-Region-Automaton

Input: Product timed automaton $P$.
Output: Corresponding region automaton $R$.

 1 begin
 2     Obtain $R$ by running a DFS on $P$ starting from the
       initial state and clock region $r^0 = (0, \ldots, 0)$: dfsP($q_P^0$, $r^0$).
 3 Function dfsP(state $q$, clock region $r$)
 4 begin
 5     Find the next clock region $r''$ when we have a
       transition out of $q$.
 6     $w \leftarrow$ Time between $r$ and $r''$
 7     foreach transition $t$ taken at $r''$ do
 8         Find the next clock region $r'$ once $t$ is taken by
           resetting the appropriate clock.
 9         $q' \leftarrow$ Target state of $t$.
10         if $\{q', r'\} \notin Q_R$ then
11             Add state $\{q', r'\}$ to $Q_R$ with proposition
               $\mathcal{L}_P(t)$ of $t$.
12             Add $\{q, r\} \to \{q', r'\}$ to $\to_R$ with $w$.
13             Continue search from $\{q', r'\}$: dfsP($q'$, $r'$)
14         else if $\{q, r\} \to \{q', r'\} \notin \to_R$ then
15             Add $\{q, r\} \to \{q', r'\}$ to $\to_R$ with $w$.
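
The search in Alg. 1 can be sketched in Python for the special case considered here, where every robot has one clock and each transition is forced as soon as its clock reaches the weight of the current dual state. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: it uses an explicit stack instead of recursion, it omits proposition labels, and the data layout (`weights`, `succ`, `init`) and the two-robot example are hypothetical.

```python
from itertools import product


def region_automaton(weights, succ, init):
    """Enumerate reachable (state, clock region) pairs by depth-first search.

    weights[i][s]: integer time robot i spends in dual state s.
    succ[i][s]:    list of dual states that may follow s for robot i.
    init:          tuple with the initial dual state of every robot.
    Returns (states, edges) where edges[(src, dst)] is the elapsed time.
    """
    m = len(init)
    start = (init, (0,) * m)
    states, edges, stack = {start}, {}, [start]
    while stack:
        q, r = stack.pop()
        remaining = [weights[i][q[i]] - r[i] for i in range(m)]
        dt = min(remaining)  # time until the earliest forced transition
        movers = [i for i in range(m) if remaining[i] == dt]
        # Robots whose constraint fires pick a successor; the others stay put.
        choices = [succ[i][q[i]] if i in movers else [q[i]] for i in range(m)]
        for q_next in product(*choices):
            # Movers have their clock reset; the other clocks advance by dt.
            r_next = tuple(0 if i in movers else r[i] + dt for i in range(m))
            dst = (q_next, r_next)
            edges[((q, r), dst)] = dt
            if dst not in states:
                states.add(dst)
                stack.append(dst)
    return states, edges


# Hypothetical team: both robots shuttle between a and b; robot 1 needs
# 2 time units per edge and robot 2 needs 1 time unit per edge.
weights = [{"init": 0, "ab": 2, "ba": 2}, {"init": 0, "ab": 1, "ba": 1}]
succ = [{"init": ["ab"], "ab": ["ba"], "ba": ["ab"]}] * 2
states, edges = region_automaton(weights, succ, ("init", "init"))
```

Because only the robots with the smallest remaining time move, transitions that could never fire first (the invalid transitions of the text) are never generated, and every reachable clock region is an integer tuple with at least one zero entry.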



We now show that the region automaton indeed captures
the behavior of the team. Given a run $r_R$ on $R$, we
denote the corresponding word (see Sec. II-A) as $\omega_R$ and
the corresponding time sequence of satisfying instances of
propositions (see Sec. III) as $\mathbb{T}_R$. We have

Proposition IV.4. Given individual runs of the team, $r_i =
q_i^0 q_i^1 \ldots$, $i = 1, \ldots, m$, there is a corresponding run $r_R$ on
$R$ such that the word $\omega$ generated by the team is $\omega_R$ and
the time sequence $\mathbb{T}$ of satisfying instances of propositions
for the team is $\mathbb{T}_R$.

Proof. Each run $r_i = q_i^0 q_i^1 \ldots$ uniquely corresponds to a
run on $D_i$, $r_{D_i} = q_{D_i}^0 (q_i^0 q_i^1)(q_i^1 q_i^2) \ldots$. Since the weight
$w_{T_i}(q_i^k, q_i^{k+1})$ is defined to be the clock constraint associated
with state $q_i^k q_i^{k+1}$ on $A_i$, there is a sequence of transitions
$r_P$ on the product timed automaton $P$ such that a transition
occurs if some set of states are visited on the $T_i$'s. Since $R$
is a bisimulation quotient of $P$, this sequence of transitions
corresponds to a run $r_R = q_R^0 \{q^1, r^1\}\{q^2, r^2\} \ldots$ on $R$, such
that each transition $(\{q^k, r^k\}, \{q^{k+1}, r^{k+1}\})$ in $r_R$ corresponds
to some set of states being visited on the $T_i$'s, which we
denote as $I(\{q^k, r^k\}, \{q^{k+1}, r^{k+1}\})$. Similarly, if some set of
states are visited on the $T_i$'s, there is a corresponding transition
$(\{q^k, r^k\}, \{q^{k+1}, r^{k+1}\})$ for some $k$. The set of propositions
satisfied at the set of states $I(\{q^k, r^k\}, \{q^{k+1}, r^{k+1}\})$ is
satisfied when the transition $(\{q^k, r^k\}, \{q^{k+1}, r^{k+1}\})$ is taken and
the state $\{q^{k+1}, r^{k+1}\}$ is reached on $R$. Therefore the word
$\omega_R$ generated by $R$ is exactly the word generated by the team.
Also note that the state $\{q^{k+1}, r^{k+1}\}$ corresponds to robots
leaving vertices $I(\{q^k, r^k\}, \{q^{k+1}, r^{k+1}\})$. Because robots
spend zero time at vertices, $\{q^{k+1}, r^{k+1}\}$ is reached at the
same time as when robots reach $I(\{q^k, r^k\}, \{q^{k+1}, r^{k+1}\})$.
Therefore, the time sequence $\mathbb{T}$ of satisfying instances of
propositions for the team is exactly $\mathbb{T}_R$ for run $r_R$. ■

D. Generating the optimal runs for individual robots 

Once the region automaton capturing the behavior of the
team is constructed, we can use Alg. OPTIMAL-RUN [8] to
obtain an optimal run $r_R^\star$ on $R$ that minimizes the limsup
as defined in (4). The optimal run $r_R^\star$ always consists of a
finite sequence of states of $R$ (prefix), followed by infinite
repetitions of another finite sequence of states of $R$ (suffix).
Such a run is said to be in a prefix-suffix form.

For the example we have shown throughout this section,
running Alg. OPTIMAL-RUN [8] on $R$ given in Fig. 4 for
the formula $\phi := \mathbf{G}\,\mathbf{F}\,\pi$ results in the optimal run
where the first row corresponds to the times when transitions
occur, the second row comprises the run $r_R^\star$, and the last row
shows the satisfying atomic propositions. For this run, we
see that $\{(ab, ab), (0, 0)\}\{(ba, bc), (0, 0)\}\{(ba, cb), (1, 0)\}$ is
the prefix and $\{(ab, ba), (0, 0)\}\{(ba, ab), (0, 0)\}$ is the suffix,
which will be repeated an infinite number of times. Moreover, for
this example, the time sequence of satisfaction of $\pi$ is $\mathbb{T}^\pi =
2, 4, 6, 8, 10, \ldots$ and the cost as defined in (4) is $J(\mathbb{T}^\pi) = 2$.
Given a run $r_R$ of $R$, we can finally project it down to
individual robots to obtain individual runs $r_i$ of $T_i$.

Definition IV.5 (Projection of a run on $R$ to $T_i$'s). Given
a run $r_R$ on $R$ where

$$r_R = \{(q_1^0, \ldots, q_m^0), (v^0(x_1), \ldots, v^0(x_m))\}\,\{(q_1^1, \ldots, q_m^1), (v^1(x_1), \ldots, v^1(x_m))\} \ldots,$$

we define its projection on $T_i$ as run $r_i = q_i^0 q_i^1 \ldots$ for all
$i = 1, \ldots, m$, where $q_i^k$ only appears in $r_i$ if $v^k(x_i) = 0$.
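
The projection of Def. IV.5 can be sketched as a filter on the clock values: an entry of the team run contributes to robot $i$'s run only when robot $i$'s clock is zero, i.e., when that robot has just completed a transition. The sketch below is an illustration under assumptions: the run is a hypothetical finite sequence over dual states, and the helper that maps a dual state such as "ab" to the vertex being left ("a") assumes single-character vertex names.

```python
def project(run, m):
    """run: list of ((s_1, ..., s_m), (v_1, ..., v_m)) region-automaton states.
    Returns one list per robot with the entries at which its own clock is zero."""
    return [[q[i] for q, v in run if v[i] == 0] for i in range(m)]


def source_vertices(dual_states):
    """Map each dual state 'ab' to the vertex 'a' it starts from
    (assumes single-character vertex names)."""
    return [s[0] for s in dual_states]


# Hypothetical finite piece of a team run for two robots.
run = [
    (("ab", "ab"), (0, 0)),
    (("ba", "bc"), (0, 0)),
    (("ba", "cb"), (1, 0)),  # robot 1 is still traveling: its clock is 1
    (("ab", "ba"), (0, 0)),
    (("ba", "ab"), (0, 0)),
]
per_robot = project(run, 2)
```

In this example the third entry is skipped for robot 1, whose clock is non-zero there, while robot 2 records a new dual state at every entry.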

It can be easily seen that, given tr, its set of projected 
runs Ti correspond to tr as defined in Prop. IV.4| i.e., the 
behavior of the team where robot i follows run r^ is captured 
exactly by tr. Moreover, if run tr is in prefix- suffix form, all 
individual runs r^ projected from tr are in prefix-suffix form. 
Therefore, the individual runs projected from the optimal run 
r^ are always in prefix-suffix form. For the optimal run we 
obtained for the previous example, using Def . |IV.5| we have 
runs of individual robots as follows: 

Note that, at time $t = 3$, the first robot is still traveling
from $b$ to $a$; therefore the clock of the first robot is not
zero at this time, i.e., $v(x_1) \neq 0$, and $b$ does not appear
in $r_1^*$ at time $t = 3$.

We finally summarize our approach in Alg. 2 and show
that this algorithm indeed gives a solution to Prob. III.1.

Proposition IV.6. Alg. 2 solves Prob. III.1.

Proof. Note that Alg. 2 combines all steps outlined in this
section. Run $r_R^*$ obtained from Alg. OPTIMAL-RUN both
satisfies $\phi$ and minimizes (4) among all runs of $R$, which
was shown in [8]. As shown in Prop. IV.4 and as mentioned
above, there is a one-to-one correspondence between a set of
runs $\{r_1, \ldots, r_m\}$ and a run $r_R$. Therefore,
$\{r_1^*, \ldots, r_m^*\}$ as a projection of $r_R^*$ onto $T_i$'s
is the solution to Prob. III.1.



Algorithm 2: MULTI-ROBOT-OPTIMAL-RUN
Input: $m$ $T_i$'s and an LTL specification $\phi$ of the form (3).
Output: A set of runs $\{r_1^*, \ldots, r_m^*\}$ that both satisfies
$\phi$ and minimizes (4).

1 begin
2     forall the $T_i$ do
3         Construct the timed automaton $A_i$ by first
          constructing the dual TS $D_i$ and then defining
          clocks and clock constraints.
4     Find the product timed automaton $P$ of $A_1, \ldots, A_m$.
5     Construct the region automaton $R$ using
      OBTAIN-REGION-AUTOMATON.
6     Find the optimal run $r_R^*$ using OPTIMAL-RUN [8].
7     Project $r_R^*$ to $T_i$'s to obtain runs $\{r_1^*, \ldots, r_m^*\}$.



V. Implementation and Case Studies 

We implemented Alg. 2 in Objective-C as the software
package LTL Optimal Multi-robot Planner (LOMP)
and used it in conjunction with our earlier OPTIMAL-RUN
[8] algorithm to obtain simulations of robots performing
persistent data gathering missions in a road network
environment. Our user-friendly software package is available
at http://hyness.bu.edu/Software.html. It utilizes
the dot tool [19] to visualize transition systems, and the
OPTIMAL-RUN algorithm uses the LTL2BA software [20]
to convert LTL specifications to Büchi automata. A typical
usage of our software comprises three steps:
(i) The user defines $T_i$'s in text and imports them to the
    application. Then, the application creates the region
    automaton $R$ following the steps detailed in Sec. IV
    and exports an M-file which defines $R$ in Matlab.
(ii) The OPTIMAL-RUN algorithm is executed in Matlab to find
    the optimal run $r_R^*$ on $R$, which is projected onto
    $T_i$, $i = 1, \ldots, m$, to obtain the solution to Prob. III.1.
(iii) Finally, the resulting motion of the team is shown in
    a simulator.
Fig. 5: The road network showing the labels of task locations and the
quantized weights of the road segments for the two case studies. Values
in blue are weights for the case where the weights are in $\{1, \ldots, 20\}$ and
values in magenta are for the case where the weights are in $\{1, \ldots, 5\}$.

The road network that we consider for our case studies
is a collection of roads, intersections, and task locations. In
this road network, a road connects two intersections and the
task locations are always located on the side of a road. The
transition system that we used to model the motion of the
robots in this environment is illustrated in Fig. 5. We assume
that the transition systems $T_i$ of the robots are identical except
at the initial state. In the $T_i$'s, the weights of transitions are
quantized so that the resulting region transition system has a
manageable size while still preserving the relative distances
of the road segments. In the following, we consider two
cases where the weights fall in the ranges $\{1, \ldots, 5\}$ and
$\{1, \ldots, 20\}$, respectively.
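
The text does not spell out the quantization procedure. One simple possibility, given here only as an assumed sketch, is to scale all weights so that the largest maps to the top of the range and then round to the nearest integer, never going below 1. The function name `quantize_weights` and the sample edge names and lengths are hypothetical.

```python
def quantize_weights(weights, levels):
    """Map positive edge weights to integers in {1, ..., levels} by scaling and rounding."""
    if levels < 1:
        raise ValueError("levels must be at least 1")
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and strictly positive")
    largest = max(weights.values())
    # Scale so that the longest segment maps to `levels`; clamp short segments to 1.
    return {
        edge: max(1, round(w * levels / largest))
        for edge, w in weights.items()
    }


# Hypothetical road segment lengths (arbitrary units).
road_lengths = {("I1", "I2"): 400.0, ("I2", "P1"): 140.0, ("P1", "I3"): 35.0}

print(quantize_weights(road_lengths, 5))   # coarse range {1, ..., 5}
print(quantize_weights(road_lengths, 20))  # finer range {1, ..., 20}
```

A coarser range keeps clock values small and therefore keeps the region transition system smaller, at the price of distorting the relative lengths of short segments.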

We consider a persistent monitoring task where robots are
deployed to repeatedly gather and upload data. We require
robot 1 to gather data at P1 and upload the gathered data at
P5, and robot 2 to gather data at P2 and upload the gathered
data at P4. To specify this task, we let the set of atomic
propositions be

\[
\Pi = \{\mathrm{Gather}, \mathrm{R1Gather}, \mathrm{R1Upload}, \mathrm{R2Gather}, \mathrm{R2Upload}\}
\]

and assign the atomic propositions as follows:

\[
\mathcal{L}_1(P1) = \{\mathrm{R1Gather}, \mathrm{Gather}\}, \quad \mathcal{L}_1(P5) = \{\mathrm{R1Upload}\},
\]
\[
\mathcal{L}_2(P2) = \{\mathrm{R2Gather}, \mathrm{Gather}\}, \quad \mathcal{L}_2(P4) = \{\mathrm{R2Upload}\}.
\]

We aim to minimize the maximum time in between data
gatherings performed by either robot 1 or 2. Therefore we
set the proposition Gather to be satisfied whenever either robot
visits its gathering location, and we set it as the optimizing
proposition ($\pi$ as in formula (5)). We set the propositions
{R1Gather, R1Upload} and {R2Gather, R2Upload} to be
robot specific since the robots gather and upload at different
locations. For both robots, we enforce the rule that, after
each data gathering, the data must be uploaded at the upload
location before another data gathering. This rule can be
specified in LTL as follows:

\[
\varphi = \mathbf{G}\big(\mathrm{R1Gather} \Rightarrow \mathbf{X}(\neg \mathrm{R1Gather} \;\mathcal{U}\; \mathrm{R1Upload})\big)
\wedge \mathbf{G}\big(\mathrm{R2Gather} \Rightarrow \mathbf{X}(\neg \mathrm{R2Gather} \;\mathcal{U}\; \mathrm{R2Upload})\big).
\]

Our overall LTL formula in the form of (3) is
$\phi = \varphi \wedge \mathbf{G}\,\mathbf{F}\,\mathrm{Gather}$.
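
To make the gather-then-upload rule concrete, the following Python sketch scans a finite prefix of a robot's trace for a violation. It is an illustration under stated assumptions, not part of the described tool chain: a trace is a list of sets of atomic propositions, the function name `violates_gather_upload_rule` is hypothetical, and a finite prefix can only expose a second gathering before an upload. It cannot confirm that an upload eventually happens, nor that Gather is satisfied infinitely often.

```python
def violates_gather_upload_rule(trace, gather, upload):
    """Return True if a gather is followed by another gather before any upload."""
    pending = False  # True while a gather has not yet been followed by an upload
    for labels in trace:
        if pending:
            if upload in labels:
                pending = False   # obligation discharged at this position
            elif gather in labels:
                return True       # second gather before an upload
        if gather in labels:
            pending = True        # new obligation, starting from the next position
    return False


good_trace = [{"R1Gather", "Gather"}, set(), {"R1Upload"}, set(), {"R1Gather", "Gather"}]
bad_trace = [{"R1Gather", "Gather"}, set(), {"R1Gather", "Gather"}]

print(violates_gather_upload_rule(good_trace, "R1Gather", "R1Upload"))  # False
print(violates_gather_upload_rule(bad_trace, "R1Gather", "R1Upload"))   # True
```

The obligation created by a gather is checked from the next position onward, mirroring the next operator in the formula.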

Fig. 6: Simulated team trajectories for the two case studies. (a) and (b)
correspond to the cases where the weights are within the ranges $\{1, \ldots, 5\}$
and $\{1, \ldots, 20\}$, respectively. Robot 1 and robot 2 travel between red and
blue task locations, respectively. Regions filled with a solid color are data
gathering locations and regions with a diagonal pattern are upload locations.

Running our algorithms on an iMac i5 quad-core computer,
we obtain the solutions illustrated in Fig. 6. For
the case where the weights are in the range $\{1, \ldots, 5\}$, the
algorithm ran for 90 seconds, the region transition system
$R$ that the OPTIMAL-RUN algorithm worked on had 2337
states, and the value of the cost function was 11 time units,
meaning that the maximum time in between data gatherings
was 11 time units. For the case where the weights are in
the range $\{1, \ldots, 20\}$, our algorithm ran for 10 minutes, $R$
had 6191 states, and the value of the cost function was 22
time units. Our video submission accompanying the paper
displays the robot trajectories for both cases.

It is interesting to note that, for the case where the weights
are in $\{1, \ldots, 20\}$, the optimal team trajectories have robots
spending extra time entering and exiting some vertices. This
behavior is actually time-wise optimal since it decreases the
maximum time between satisfying instances of the optimizing
proposition, minimizing the cost function.

VI. Conclusions 

In this paper we presented a method for planning the 
optimal motion for a team of robots in a common envi- 
ronment subject to temporal logic constraints. The problem 
is important in applications where multiple robots have to 
perform a sequence of operations collectively subject to 
various external constraints. We considered temporal logic 
specifications which contain an optimizing proposition that 
must be repeatedly satisfied. The motion plan that our 
method provides is optimal in the sense that it minimizes 
the maximum time between satisfying instances of the opti- 
mizing proposition. 

There are many promising directions for future work. In 
particular, we are looking at the case where one allows delays 
when robots take transitions. We are also investigating more 
realistic robot models such as Markov Decision Processes 
(MDPs) and Partially Observable MDPs. 